(54) Title: MODIFIED KERATINOCYTE GROWTH FACTOR (KGF) WITH REDUCED IMMUNOGENICITY

(57) Abstract: The present invention relates to polypeptides to be administered especially to humans and in particular for therapeutic
use. The polypeptides are modified polypeptides whereby the modification results in a reduced propensity for the polypeptide
to elicit an immune response upon administration to the human subject. The invention in particular relates to the modification of
keratinocyte growth factor (KGF) to result in keratinocyte growth factor (KGF) proteins that are substantially non-immunogenic or
less immunogenic than any non-modified counterpart when used in vivo.
MODIFIED KERATINOCYTE GROWTH FACTOR (KGF) WITH
REDUCED IMMUNOGENICITY 



FIELD OF THE INVENTION 

The present invention relates to polypeptides to be administered especially to humans and
in particular for therapeutic use. The polypeptides are modified polypeptides whereby the
modification results in a reduced propensity for the polypeptide to elicit an immune
response upon administration to the human subject. The invention in particular relates to
the modification of human keratinocyte growth factor (KGF) to result in KGF protein
variants that are substantially non-immunogenic or less immunogenic than any non-modified
counterpart when used in vivo. The invention relates furthermore to T-cell
epitope peptides derived from said non-modified protein by means of which it is possible
to create modified keratinocyte growth factor variants with reduced immunogenicity.

BACKGROUND OF THE INVENTION

There are many instances whereby the efficacy of a therapeutic protein is limited by an
unwanted immune reaction to the therapeutic protein. Several mouse monoclonal
antibodies have shown promise as therapies in a number of human disease settings but in
certain cases have failed due to the induction of significant degrees of a human anti-murine
antibody (HAMA) response [Schroff, R. W. et al (1985) Cancer Res. 45: 879-885;
Shawler, D.L. et al (1985) J. Immunol. 135: 1530-1535]. For monoclonal antibodies, a
number of techniques have been developed in an attempt to reduce the HAMA response
[WO 89/09622; EP 0239400; EP 0438310; WO 91/06667]. These recombinant DNA
approaches have generally reduced the mouse genetic information in the final antibody
construct whilst increasing the human genetic information in the final construct.
Notwithstanding, the resultant "humanized" antibodies have, in several cases, still elicited
an immune response in patients [Isaacs J.D. (1990) Sem. Immunol. 2: 449-456; Rebello,
P.R. et al (1999) Transplantation 68: 1417-1420].

Antibodies are not the only class of polypeptide molecule administered as a therapeutic
agent against which an immune response may be mounted. Even proteins of human
origin and with the same amino acid sequences as occur within humans can still induce an
immune response in humans. Notable examples include the therapeutic use of
granulocyte-macrophage colony stimulating factor [Wadhwa, M. et al (1999) Clin.
Cancer Res. 5: 1353-1361] and interferon alpha 2 [Russo, D. et al (1996) Brit. J. Haem.
94: 300-305; Stein, R. et al (1988) New Engl. J. Med. 318: 1409-1413] amongst others.

A principal factor in the induction of an immune response is the presence within the
protein of peptides that can stimulate the activity of T-cells via presentation on MHC class
II molecules, so-called "T-cell epitopes". Such potential T-cell epitopes are commonly
defined as any amino acid residue sequence with the ability to bind to MHC Class II
molecules. Such T-cell epitopes can be measured to establish MHC binding. Implicitly, a
"T-cell epitope" means an epitope which when bound to MHC molecules can be
recognized by a T-cell receptor (TCR), and which can, at least in principle, cause the
activation of these T-cells by engaging a TCR to promote a T-cell response. It is,
however, usually understood that certain peptides which are found to bind to MHC Class
II molecules may be retained in a protein sequence because such peptides are recognized
as "self" within the organism into which the final protein is administered.

It is known that certain of these T-cell epitope peptides can be released during the
degradation of peptides, polypeptides or proteins within cells and subsequently be
presented by molecules of the major histocompatibility complex (MHC) in order to
trigger the activation of T-cells. For peptides presented by MHC Class II, such activation
of T-cells can then give rise, for example, to an antibody response by direct stimulation of
B-cells to produce such antibodies.

MHC Class II molecules are a group of highly polymorphic proteins which play a central
role in helper T-cell selection and activation. The human leukocyte antigen group DR
(HLA-DR) are the predominant isotype of this group of proteins and are the major focus
of the present invention. However, isotypes HLA-DQ and HLA-DP perform similar
functions, hence the present invention is equally applicable to these. The MHC class II DR
molecule is made of an alpha and a beta chain which insert at their C-termini through the
cell membrane. Each hetero-dimer possesses a ligand binding domain which binds to
peptides varying between 9 and 20 amino acids in length, although the binding groove
can accommodate a maximum of 11 amino acids. The ligand binding domain is
comprised of amino acids 1 to 85 of the alpha chain, and amino acids 1 to 94 of the beta
chain. DQ molecules have recently been shown to have a homologous structure and the
DP family proteins are also expected to be very similar. In humans approximately 70
different allotypes of the DR isotype are known, for DQ there are 30 different allotypes
and for DP 47 different allotypes are known. Each individual bears two to four DR
alleles, two DQ and two DP alleles. The structure of a number of DR molecules has been
solved and such structures point to an open-ended peptide binding groove with a number
of hydrophobic pockets which engage hydrophobic residues (pocket residues) of the
peptide [Brown et al Nature (1993) 364: 33; Stern et al (1994) Nature 368: 215].
Polymorphism identifying the different allotypes of class II molecule contributes to a
wide diversity of different binding surfaces for peptides within the peptide binding groove
and at the population level ensures maximal flexibility with regard to the ability to
recognize foreign proteins and mount an immune response to pathogenic organisms.

There is a considerable amount of polymorphism within the ligand binding domain with
distinct "families" within different geographical populations and ethnic groups. This
polymorphism affects the binding characteristics of the peptide binding domain, thus
different "families" of DR molecules will have specificities for peptides with different
sequence properties, although there may be some overlap. This specificity determines
recognition of Th-cell epitopes (Class II T-cell response) which are ultimately responsible
for driving the antibody response to B-cell epitopes present on the same protein from
which the Th-cell epitope is derived. Thus, the immune response to a protein in an
individual is heavily influenced by T-cell epitope recognition which is a function of the
peptide binding specificity of that individual's HLA-DR allotype. Therefore, in order to
identify T-cell epitopes within a protein or peptide in the context of a global population, it
is desirable to consider the binding properties of as diverse a set of HLA-DR allotypes as
possible, thus covering as high a percentage of the world population as possible.

An immune response to a therapeutic protein, such as the protein which is the object of this
invention, proceeds via the MHC class II peptide presentation pathway. Here exogenous
proteins are engulfed and processed for presentation in association with MHC class II
molecules of the DR, DQ or DP type. MHC Class II molecules are expressed by
professional antigen presenting cells (APCs), such as macrophages and dendritic cells
amongst others. Engagement of a MHC class II peptide complex by a cognate T-cell
receptor on the surface of the T-cell, together with the cross-binding of certain other
co-receptors such as the CD4 molecule, can induce an activated state within the T-cell.
Activation leads to the release of cytokines further activating other lymphocytes such as B
cells to produce antibodies or activating T killer cells as a full cellular immune response.

The ability of a peptide to bind a given MHC class II molecule for presentation on the
surface of an APC is dependent on a number of factors, most notably its primary
sequence. This will influence both its propensity for proteolytic cleavage and also its
affinity for binding within the peptide binding cleft of the MHC class II molecule. The
MHC class II / peptide complex on the APC surface presents a binding face to a particular
T-cell receptor (TCR) able to recognize determinants provided both by exposed residues
of the peptide and the MHC class II molecule.

In the art there are procedures for identifying synthetic peptides able to bind MHC class II
molecules (e.g. WO98/52976 and WO00/34317). Such peptides may not function as T-cell
epitopes in all situations, particularly in vivo, due to the processing pathways or other
phenomena. T-cell epitope identification is the first step to epitope elimination. The
identification and removal of potential T-cell epitopes from proteins has been previously
disclosed. In the art, methods have been provided to enable the detection of T-cell epitopes,
usually by computational means scanning for recognized sequence motifs in
experimentally determined T-cell epitopes or alternatively using computational
techniques to predict MHC class II-binding peptides and in particular DR-binding
peptides.

WO98/52976 and WO00/34317 teach computational threading approaches to identifying
polypeptide sequences with the potential to bind a sub-set of human MHC class II DR
allotypes. In these teachings, predicted T-cell epitopes are removed by the use of
judicious amino acid substitution within the primary sequence of the therapeutic antibody
or non-antibody protein of both non-human and human derivation.

Other techniques exploiting soluble complexes of recombinant MHC molecules in
combination with synthetic peptides and able to bind to T-cell clones from peripheral
blood samples from human or experimental animal subjects have been used in the art
[Kern, F. et al (1998) Nature Medicine 4:975-978; Kwok, W.W. et al (2001) TRENDS in
Immunology 22: 583-588] and may also be exploited in an epitope identification strategy.

As described above, and as a consequence thereof, it would be desirable to identify and to
remove or at least to reduce T-cell epitopes from a given peptide, polypeptide or protein
that is in principle therapeutically valuable but originally immunogenic.

One of these therapeutically valuable molecules is "keratinocyte growth factor (KGF)".
KGF is a member of the fibroblast growth factor (FGF) / heparin-binding growth factor
family of proteins. It is a secreted glycoprotein expressed predominantly in the lung,
promoting wound healing by stimulating the growth of keratinocytes and other epithelial
cells [Finch et al (1989), Science 245: 752-755; Rubin et al (1989), Proc. Natl. Acad. Sci.
U.S.A. 86: 802-806]. The mature (processed) form of the glycoprotein comprises 163
amino acid residues and may be isolated from conditioned media following culture of
particular cell lines [Rubin et al, (1989) ibid.], or produced using recombinant techniques
[Ron et al (1993) J. Biol. Chem. 268: 2984-2988]. The protein is of therapeutic value for
the stimulation of epithelial cell growth in a number of significant disease and injury
repair settings. This disclosure specifically pertains to the human KGF protein being the
mature (processed) form of 163 amino acid residues.

Others have also provided KGF molecules [e.g. US 6,008,328; WO90/08771] including
modified KGF [Ron et al (1993) ibid; WO95/01434]. However, such teachings have not
recognized the importance of T-cell epitopes to the immunogenic properties of the protein
nor have been conceived to directly influence said properties in a specific and controlled
way according to the scheme of the present invention.

The amino acid sequence of keratinocyte growth factor (KGF) (depicted as one-letter
code) is as follows:

MCNDMTPEQMATNVNCSSPERHTRSYDYMEGGDIRVRRLFCRTQWYLRIDKRGKVKGTQEMKNNY
NIMEIRTVAVGIVAIKGVESEFYLAMNKEGKLYAKKECNEDCNFKELILENHYNTYASAKWTHNG
GEMFVALNQKGIPVRGKKTKKEQKTAHFLPMAIT

However, there is a continued need for keratinocyte growth factor (KGF) analogues with
enhanced properties. Desired enhancements include alternative schemes and modalities
for the expression and purification of the said therapeutic, but also and especially,
improvements in the biological properties of the protein. There is a particular need for
enhancement of the in vivo characteristics when administered to the human subject. In
this regard, it is highly desired to provide keratinocyte growth factor (KGF) with reduced
or absent potential to induce an immune response in the human subject.

SUMMARY AND DESCRIPTION OF THE INVENTION 

The present invention provides for modified forms of "keratinocyte growth factor
(KGF)", in which the immune characteristic is modified by means of reduced or removed
numbers of potential T-cell epitopes. The present invention provides for modified forms
of human keratinocyte growth factor (KGF) with one or more T-cell epitopes removed.
KGF proteins such as identified from other mammalian or vertebrate sources have in
common many of the peptide sequences of the present disclosure and have in common
many peptide sequences with substantially the same sequence as those of the disclosed
listing. Such protein sequences equally therefore fall under the scope of the present
invention.

The invention discloses sequences identified within the keratinocyte growth factor
primary sequence that are potential T-cell epitopes by virtue of MHC class II binding
potential. This disclosure specifically pertains to the human KGF protein of 163
amino acid residues.

The invention also discloses specific positions within the primary sequence of the
molecule according to the invention which have to be altered by specific amino acid
substitution, addition or deletion without, in principle, affecting the biological activity. In
cases in which the loss of immunogenicity can be achieved only by a simultaneous loss of
biological activity it is possible to restore said activity by further alterations within the
amino acid sequence of the protein.

The invention discloses furthermore methods to produce such modified molecules, above
all methods to identify said T-cell epitopes which have to be altered in order to reduce or
remove immunogenic sites. The invention may be applied to any KGF species of
molecule with substantially the same primary amino acid sequences as those disclosed
herein and would include therefore KGF molecules derived by genetic engineering means
or other processes, which may not contain 163 amino acid residues.

The protein according to this invention would be expected to display an increased circulation
time within the human subject and would be of particular benefit in chronic or recurring
disease settings such as is the case for a number of indications for keratinocyte growth
factor (KGF). The present invention provides for modified forms of KGF proteins that are
expected to display enhanced properties in vivo. These modified KGF molecules can be
used in pharmaceutical compositions.

In summary the invention relates to the following issues: 

• a modified molecule having the biological activity of keratinocyte growth factor 
(KGF) and being substantially non-immunogenic or less immunogenic than any non- 
modified molecule having the same biological activity when used in vivo; 






WO 02/062842 - 7 - PCT/EP02/01175 

• an accordingly specified molecule, wherein said loss of immunogenicity is achieved by 
removing one or more T-cell epitopes derived from the originally non-modified 
molecule; 

• an accordingly specified molecule, wherein said loss of immunogenicity is achieved by 
reduction in numbers of MHC allotypes able to bind peptides derived from said 
molecule; 

• an accordingly specified molecule, wherein one T-cell epitope is removed; 

• an accordingly specified molecule, wherein said originally present T-cell epitopes are 
MHC class II ligands or peptide sequences which show the ability to stimulate or bind 
T-cells via presentation on class II; 

• an accordingly specified molecule, wherein said peptide sequences are selected from 
the group as depicted in Table 1; 

• an accordingly specified molecule, wherein 1-9 amino acid residues, preferably one 
amino acid residue in any of the originally present T-cell epitopes are altered; 

• an accordingly specified molecule, wherein the alteration of the amino acid residues is 
substitution, addition or deletion of originally present amino acid(s) residue(s) by other 
amino acid residue(s) at specific position(s); 

• an accordingly specified molecule, wherein one or more of the amino acid residue 
substitutions are carried out as indicated in Table 2; 

• an accordingly specified molecule, wherein (additionally) one or more of the amino 
acid residue substitutions are carried out as indicated in Table 3 for the reduction in the 
number of MHC allotypes able to bind peptides derived from said molecule; 

• an accordingly specified molecule, wherein, if necessary, additionally further alteration 
usually by substitution, addition or deletion of specific amino acid(s) is conducted to 
restore biological activity of said molecule; 

• a DNA sequence or molecule which codes for any of said modified molecules as
specified above and below;

• a pharmaceutical composition comprising a modified molecule having the biological
activity of keratinocyte growth factor (KGF) as defined above and / or in the claims,
optionally together with a pharmaceutically acceptable carrier, diluent or excipient;

• a method for manufacturing a modified molecule having the biological activity of
keratinocyte growth factor (KGF) as defined in any of the above-cited
claims comprising the following steps: (i) determining the amino acid sequence of the
polypeptide or part thereof; (ii) identifying one or more potential T-cell epitopes within
the amino acid sequence of the protein by any method including determination of the
binding of the peptides to MHC molecules using in vitro or in silico techniques or
biological assays; (iii) designing new sequence variants with one or more amino acids
within the identified potential T-cell epitopes modified in such a way to substantially
reduce or eliminate the activity of the T-cell epitope as determined by the binding of
the peptides to MHC molecules using in vitro or in silico techniques or biological
assays; (iv) constructing such sequence variants by recombinant DNA techniques and
testing said variants in order to identify one or more variants with desirable properties;
and (v) optionally repeating steps (ii) - (iv);

• an accordingly specified method, wherein step (iii) is carried out by substitution,
addition or deletion of 1 - 9 amino acid residues in any of the originally present T-cell
epitopes;

• an accordingly specified method, wherein the alteration is made with reference to a
homologous protein sequence and / or in silico modeling techniques;

• an accordingly specified method, wherein step (ii) of above is carried out by the
following steps: (a) selecting a region of the peptide having a known amino acid
residue sequence; (b) sequentially sampling overlapping amino acid residue segments
of predetermined uniform size and constituted by at least three amino acid residues
from the selected region; (c) calculating MHC Class II molecule binding score for each
said sampled segment by summing assigned values for each hydrophobic amino acid
residue side chain present in said sampled amino acid residue segment; and (d)
identifying at least one of said segments suitable for modification, based on the
calculated MHC Class II molecule binding score for that segment, to change overall
MHC Class II binding score for the peptide without substantially reducing therapeutic
utility of the peptide; step (c) is preferably carried out by using a Böhm scoring
function modified to include 12-6 van der Waals ligand-protein energy repulsive term
and ligand conformational energy term by (1) providing a first data base of MHC Class
II molecule models; (2) providing a second data base of allowed peptide backbones for
said MHC Class II molecule models; (3) selecting a model from said first data base;
(4) selecting an allowed peptide backbone from said second data base; (5) identifying
amino acid residue side chains present in each sampled segment; (6) determining the
binding affinity value for all side chains present in each sampled segment; and
repeating steps (1) through (5) for each said model and each said backbone;

• a 13mer T-cell epitope peptide having a potential MHC class II binding activity and
created from immunogenetically non-modified keratinocyte growth factor (KGF),
selected from the group as depicted in Table 1 and its use for the manufacture of KGF
having substantially no or less immunogenicity than any non-modified molecule with
the same biological activity when used in vivo;

• a peptide sequence consisting of at least 9 consecutive amino acid residues of a 13mer
T-cell epitope peptide as specified above and its use for the manufacture of KGF
having substantially no or less immunogenicity than any non-modified molecule with
the same biological activity when used in vivo.
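
Sub-steps (a) to (d) of the sampling-and-scoring method listed above can be sketched in Python as follows. This is a minimal illustration only: the function names, the choice of which residues count as hydrophobic and the uniform weight of 1.0 per residue are assumptions made for the example and are not values taken from this disclosure, and the sketch does not implement the modified Böhm scoring function.

```python
# Illustrative weights only (assumption): every listed hydrophobic residue scores 1.0.
HYDROPHOBIC_WEIGHTS = {"F": 1.0, "I": 1.0, "L": 1.0, "M": 1.0,
                       "V": 1.0, "W": 1.0, "Y": 1.0}


def sample_segments(region, size):
    """Step (b): return (1-based start, segment) for every overlapping window."""
    if size < 3:
        raise ValueError("segment size must be at least three residues")
    return [(i + 1, region[i:i + size])
            for i in range(len(region) - size + 1)]


def score_segment(segment, weights=HYDROPHOBIC_WEIGHTS):
    """Step (c): sum the assigned value of each hydrophobic residue."""
    return sum(weights.get(residue, 0.0) for residue in segment)


def rank_segments(region, size=13, weights=HYDROPHOBIC_WEIGHTS):
    """Step (d): order segments by descending score, then by start position."""
    scored = [(score_segment(seg, weights), start, seg)
              for start, seg in sample_segments(region, size)]
    return sorted(scored, key=lambda item: (-item[0], item[1]))


if __name__ == "__main__":
    # Step (a): a region of known sequence, here a stretch of the KGF listing.
    for score, start, segment in rank_segments("NIMEIRTVAVGIVAIKGVESEF"):
        print(start, segment, score)
```

The highest-ranked segments are only candidates for modification; whether a change preserves therapeutic utility has to be established separately.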


The term "T-cell epitope" means, according to the understanding of this invention, an
amino acid sequence which is able to bind MHC II, able to stimulate T-cells and / or also
to bind (without necessarily measurably activating) T-cells in complex with MHC II.
The term "peptide" as used herein and in the appended claims, is a compound that
includes two or more amino acids. The amino acids are linked together by a peptide bond
(defined herein below). There are 20 different naturally occurring amino acids involved
in the biological production of peptides, and any number of them may be linked in any
order to form a peptide chain or ring. The naturally occurring amino acids employed in
the biological production of peptides all have the L-configuration. Synthetic peptides can
be prepared employing conventional synthetic methods, utilizing L-amino acids, D-amino
acids, or various combinations of amino acids of the two different configurations. Some
peptides contain only a few amino acid units. Short peptides, e.g., having less than ten
amino acid units, are sometimes referred to as "oligopeptides". Other peptides contain a
large number of amino acid residues, e.g. up to 100 or more, and are referred to as
"polypeptides". By convention, a "polypeptide" may be considered as any peptide chain
containing three or more amino acids, whereas an "oligopeptide" is usually considered as a
particular type of "short" polypeptide. Thus, as used herein, it is understood that any
reference to a "polypeptide" also includes an oligopeptide. Further, any reference to a
"peptide" includes polypeptides, oligopeptides, and proteins. Each different arrangement
of amino acids forms different polypeptides or proteins. The number of polypeptides, and
hence the number of different proteins, that can be formed is practically unlimited.
"Alpha carbon (Cα)" is the carbon atom of the carbon-hydrogen (CH) component that is
in the peptide chain. A "side chain" is a pendant group to Cα that can comprise a simple
or complex group or moiety, having physical dimensions that can vary significantly
compared to the dimensions of the peptide.

The invention may be applied to any keratinocyte growth factor (KGF) species of
molecule with substantially the same primary amino acid sequences as those disclosed
herein and would include therefore keratinocyte growth factor (KGF) molecules derived
by genetic engineering means or other processes, which may not contain 163 amino
acid residues.

Keratinocyte growth factor (KGF) proteins such as identified from other mammalian
sources have in common many of the peptide sequences of the present disclosure and
have in common many peptide sequences with substantially the same sequence as those
of the disclosed listing. Such protein sequences equally therefore fall under the scope of
the present invention.

The invention is conceived to overcome the practical reality that soluble proteins
introduced into autologous organisms can trigger an immune response resulting in
development of host antibodies that bind to the soluble protein. One example, amongst
others, is interferon alpha 2, to which a proportion of human patients make antibodies
despite the fact that this protein is produced endogenously [Russo, D. et al (1996) ibid;
Stein, R. et al (1988) ibid]. It is likely that the same situation pertains to the therapeutic
use of keratinocyte growth factor (KGF) and the present invention seeks to address this
by providing keratinocyte growth factor (KGF) proteins with altered propensity to elicit
an immune response on administration to the human host.

The general method of the present invention leading to the modified keratinocyte growth
factor (KGF) comprises the following steps:

(a) determining the amino acid sequence of the polypeptide or part thereof;

(b) identifying one or more potential T-cell epitopes within the amino acid sequence of
the protein by any method including determination of the binding of the peptides to MHC
molecules using in vitro or in silico techniques or biological assays;

(c) designing new sequence variants with one or more amino acids within the identified
potential T-cell epitopes modified in such a way to substantially reduce or eliminate the
activity of the T-cell epitope as determined by the binding of the peptides to MHC
molecules using in vitro or in silico techniques or biological assays. Such sequence
variants are created in such a way to avoid creation of new potential T-cell epitopes by
the sequence variations unless such new potential T-cell epitopes are, in turn, modified in
such a way to substantially reduce or eliminate the activity of the T-cell epitope; and

(d) constructing such sequence variants by recombinant DNA techniques and testing said
variants in order to identify one or more variants with desirable properties according to
well known recombinant techniques.
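
Step (c) requires that a substitution does not itself create a new potential T-cell epitope. One way to organise that check, sketched here under the assumption that epitopes are assessed as overlapping 13mers as in Table 1, is to enumerate every window of the variant sequence that contains the substituted position, so that each can be passed again to whatever method is used in step (b). The function names are hypothetical and the sketch performs no binding prediction.

```python
def apply_substitution(sequence, position, new_residue):
    """Return a copy of `sequence` with the residue at 1-based `position` replaced."""
    if not 1 <= position <= len(sequence):
        raise ValueError("position is outside the sequence")
    if len(new_residue) != 1:
        raise ValueError("new_residue must be a single letter")
    return sequence[:position - 1] + new_residue + sequence[position:]


def windows_covering(sequence, position, size=13):
    """Return (1-based start, window) for every window of `size` containing `position`."""
    first = max(1, position - size + 1)
    last = min(position, len(sequence) - size + 1)
    return [(start, sequence[start - 1:start - 1 + size])
            for start in range(first, last + 1)]


def windows_to_recheck(sequence, position, new_residue, size=13):
    """List the windows of the variant sequence that must be re-assessed in step (b)."""
    variant = apply_substitution(sequence, position, new_residue)
    return windows_covering(variant, position, size)


if __name__ == "__main__":
    # Hypothetical example: replace residue 7 of a fragment of the KGF listing.
    for start, window in windows_to_recheck("RTVAVGIVAIKGVESEFYLAMNK", 7, "T"):
        print(start, window)
```

A single substitution in the interior of the protein touches up to 13 overlapping 13mers, which is why each variant is re-screened rather than only the epitope that prompted the change.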

The identification of potential T-cell epitopes according to step (b) can be carried out
according to methods described previously in the prior art. Suitable methods are
disclosed in WO 98/59244; WO 98/52976; WO 00/34317 and may preferably be used to
identify binding propensity of keratinocyte growth factor (KGF)-derived peptides to an
MHC class II molecule.






Another very efficacious method for identifying T-cell epitopes by calculation is 
described in the EXAMPLE which is a preferred embodiment according to this invention. 



In practice a number of variant keratinocyte growth factor (KGF) proteins will be
produced and tested for the desired immune and functional characteristic. The variant
proteins will most preferably be produced by recombinant DNA techniques although
other procedures including chemical synthesis of keratinocyte growth factor (KGF)
fragments may be contemplated.

The results of an analysis according to step (b) of the above scheme and pertaining to the
human keratinocyte growth factor (KGF) protein sequence of 163 amino acid residues are
presented in Table 1.

Table 1: Peptide sequences in human keratinocyte growth factor (KGF) with potential 
human MHC class II binding activity. 



NDMTPEQMATNVN, DMTPEQMATNVNC, EQMATNVNCSSPE, TNVNCSSPERHTR,
RSYDYMEGGDIRV, YDYMEGGDIRVRR, DYMEGGDIRVRRL, GDIRVRRLFCRTQ,
IRVRRLFCRTQWY, RRLFCRTQWYLRI, RLFCRTQWYLRID, TQWYLRIDKRGKV,
QWYLRIDKRGKVK, WYLRIDKRGKVKG, LRIDKRGKVKGTQ, GKVKGTQEMKNNY,
QEMKNNYNIMEIR, NNYNIMEIRTVAV, YNIMEIRTVAVGI, NIMEIRTVAVGIV,
MEIRTVAVGIVAI, RTVAVGIVAIKGV, VAVGIVAIKGVES, VGIVAIKGVESEF,
VAIKGVESEFYLA, KGVESEFYLAMNK, SEFYLAMNKEGKL, EFYLAMNKEGKLY,
FYLAMNKEGKLYA, LAMNKEGKLYAKK, GKLYAKKECNEDC, KLYAKKECNEDCN,
CNFKELILENHYN, KELILENHYNTYA, ELILENHYNTYAS, LILENHYNTYASA,
NHYNTYASAKWTH, NTYASAKWTHNGG, AKWTHNGGEMFVA, GEMFVALNQKGIP,
EMFVALNQKGIPV, FVALNQKGIPVRG, VALNQKGIPVRGK, KGIPVRGKKTKKE,
IPVRGKKTKKEQK, KTKKEQKTAHFLP

Peptides are 13mers; amino acids are identified using single letter code.
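
Because each Table 1 entry is a contiguous 13-residue stretch of the KGF sequence, a transcription of the table can be checked mechanically against the sequence listing. The Python sketch below is offered as an illustration only; the function names are hypothetical. It assumes the sequence listing given earlier with whitespace removed and with the stretch preceding EGKLYAKK read as FYLAMNK, the reading that agrees with the Table 1 peptides. As typed here the string has 164 characters including the initial M, so the reported positions refer to this string and not to any numbering used elsewhere in the disclosure.

```python
# KGF sequence as transcribed from the listing (assumption: whitespace removed).
KGF = (
    "MCNDMTPEQMATNVNCSSPERHTRSYDYMEGGDIRVRRLFCRTQWYLRIDKRGKVKGTQEMKNNY"
    "NIMEIRTVAVGIVAIKGVESEFYLAMNKEGKLYAKKECNEDCNFKELILENHYNTYASAKWTHNG"
    "GEMFVALNQKGIPVRGKKTKKEQKTAHFLPMAIT"
)


def locate(peptide, sequence=KGF):
    """Return the 1-based start of the first occurrence of `peptide`, or None."""
    index = sequence.find(peptide)
    return index + 1 if index >= 0 else None


def all_13mers(sequence=KGF, size=13):
    """Return every overlapping window of `size` residues, in sequence order."""
    return [sequence[i:i + size] for i in range(len(sequence) - size + 1)]


def check_table(peptides, sequence=KGF, size=13):
    """Map each peptide to its start position; None flags a likely transcription error."""
    return {p: (locate(p, sequence) if len(p) == size else None) for p in peptides}


if __name__ == "__main__":
    sample = ["NDMTPEQMATNVN", "YDYMEGGDIRVRR", "KTKKEQKTAHFLP"]
    for peptide, start in check_table(sample).items():
        print(peptide, start)
```

An entry that returns None is either mis-transcribed or not part of the sequence as typed, and is worth comparing against the original table.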



The results of the design and construction according to steps (c) and (d) of the above scheme
and pertaining to the modified molecule of this invention are presented in Tables 2 and 3.

Table 2: Substitutions leading to the elimination of potential T-cell epitopes of human
keratinocyte growth factor (KGF) (WT = wild type).

Table 3: Additional substitutions leading to the removal of a potential T-cell epitope for
one or more MHC allotypes.

The invention relates to keratinocyte growth factor (KGF) analogues in which
substitutions of at least one amino acid residue have been made at positions resulting in a
substantial reduction in activity of or elimination of one or more potential T-cell epitopes
from the protein. One or more amino acid substitutions at particular points within any of
the potential MHC class II ligands identified in Table 1 may result in a keratinocyte
growth factor (KGF) molecule with a reduced immunogenic potential when administered
as a therapeutic to the human host. Preferably, amino acid substitutions are made at
appropriate points within the peptide sequence predicted to achieve substantial reduction
or elimination of the activity of the T-cell epitope. In practice an appropriate point will
preferably equate to an amino acid residue binding within one of the hydrophobic pockets
provided within the MHC class II binding groove.

It is most preferred to alter binding within the first pocket of the cleft at the so-called P1
or P1 anchor position of the peptide. The quality of binding interaction between the P1
anchor residue of the peptide and the first pocket of the MHC class II binding groove is
recognized as being a major determinant of overall binding affinity for the whole peptide.
An appropriate substitution at this position of the peptide will be for a residue less readily
accommodated within the pocket, for example, substitution to a more hydrophilic residue.
Amino acid residues in the peptide at positions equating to binding within other pocket
regions within the MHC binding cleft are also considered and fall under the scope of the
present invention.

It is understood that single amino acid substitutions within a given potential T-cell epitope
are the most preferred route by which the epitope may be eliminated. Combinations of
substitutions within a single epitope may be contemplated and, for example, can be
particularly appropriate where individually defined epitopes overlap with each
other. Moreover, amino acid substitutions either singly within a given epitope or in
combination within a single epitope may be made at positions not equating to the "pocket
residues" with respect to the MHC class II binding groove, but at any point within the
peptide sequence. Substitutions may be made with reference to a homologous structure
or structural method produced using in silico techniques known in the art and may be
based on known structural features of the molecule according to this invention. All such
substitutions fall within the scope of the present invention.

Amino acid substitutions other than within the peptides identified above may be
contemplated, particularly when made in combination with substitution(s) made within a
listed peptide. For example, a change may be contemplated to restore structure or
biological activity of the variant molecule. Such compensatory changes, and changes that
include deletion or addition of particular amino acid residues from the keratinocyte
growth factor (KGF) polypeptide resulting in a variant with desired activity and in
combination with changes in any of the disclosed peptides, fall under the scope of the
present invention.

In as far as this invention relates to modified keratinocyte growth factor (KGF), 
compositions containing such modified keratinocyte growth factor (KGF) proteins or 
fragments of modified keratinocyte growth factor (KGF) proteins and related 
compositions should be considered within the scope of the invention. In another aspect, 
the present invention relates to nucleic acids encoding modified keratinocyte growth 
factor (KGF) entities. In a further aspect the present invention relates to methods for 
therapeutic treatment of humans using the modified keratinocyte growth factor (KGF) 
proteins. 

EXAMPLE 

There are a number of factors that play important roles in determining the total structure 
of a protein or polypeptide. First, the peptide bond, i.e., that bond which joins the amino 
acids in the chain together, is a covalent bond. This bond is planar in structure, 
essentially a substituted amide. An "amide" is any of a group of organic compounds 
containing the grouping -CONH-. 

The planar peptide bond linking Cα of adjacent amino acids may be represented as
depicted below: 



Because the O=C and the C-N atoms lie in a relatively rigid plane, free rotation does not
occur about these axes. Hence, a plane schematically depicted by the interrupted line is
sometimes referred to as an "amide" or "peptide" plane, wherein lie the oxygen (O),
carbon (C), nitrogen (N), and hydrogen (H) atoms of the peptide backbone. At opposite
corners of this amide plane are located the Cα atoms. Since there is substantially no
rotation about the O=C and C-N atoms in the peptide or amide plane, a polypeptide chain
thus comprises a series of planar peptide linkages joining the Cα atoms.

A second factor that plays an important role in defining the total structure or
conformation of a polypeptide or protein is the angle of rotation of each amide plane
about the common Cα linkage. The terms "angle of rotation" and "torsion angle" are
hereinafter regarded as equivalent terms. Assuming that the O, C, N, and H atoms remain
in the amide plane (which is usually a valid assumption, although there may be some
slight deviations from planarity of these atoms for some conformations), these angles of
rotation define the N and R polypeptide's backbone conformation, i.e., the structure as it
exists between adjacent residues. These two angles are known as φ and ψ. A set of the
angles φi, ψi, where the subscript i represents a particular residue of a polypeptide chain,
thus effectively defines the polypeptide secondary structure. The conventions used in
defining the φ, ψ angles, i.e., the reference points at which the amide planes form a zero
degree angle, and the definition of which angle is φ, and which angle is ψ, for a given
polypeptide, are defined in the literature. See, e.g., Ramachandran et al., Adv. Prot. Chem.
23:283-437 (1968), at pages 285-94, which pages are incorporated herein by reference.

The present method can be applied to any protein, and is based in part upon the discovery
that in humans the primary Pocket 1 anchor position of MHC Class II molecule binding
grooves has a well designed specificity for particular amino acid side chains. The
specificity of this pocket is determined by the identity of the amino acid at position 86 of
the beta chain of the MHC Class II molecule. This site is located at the bottom of Pocket
1 and determines the size of the side chain that can be accommodated by this pocket.
Marshall, K.W., J. Immunol., 152:4946-4956 (1994). If this residue is a glycine, then all
hydrophobic aliphatic and aromatic amino acids (hydrophobic aliphatics being: valine,
leucine, isoleucine, methionine and aromatics being: phenylalanine, tyrosine and
tryptophan) can be accommodated in the pocket, a preference being for the aromatic side
chains. If this pocket residue is a valine, then the side chain of this amino acid protrudes
into the pocket and restricts the size of peptide side chains that can be accommodated
such that only hydrophobic aliphatic side chains can be accommodated. Therefore, in an
amino acid residue sequence, wherever an amino acid with a hydrophobic aliphatic or
aromatic side chain is found, there is the potential for an MHC Class II restricted T-cell
epitope to be present. If the side-chain is hydrophobic aliphatic, however, it is
approximately twice as likely to be associated with a T-cell epitope as an aromatic side
chain (assuming an approximately even distribution of Pocket 1 types throughout the
global population).
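
The Pocket 1 rule described above can be illustrated with a minimal sketch. It assumes only the two pocket types named in the text (glycine or valine at beta-chain position 86) and an even split between them; the function names `pocket1_accepts` and `relative_weight` are hypothetical and are not part of the disclosed method.

```python
# Side-chain classes as listed in the text (one-letter amino acid codes).
ALIPHATIC = frozenset("VLIM")   # valine, leucine, isoleucine, methionine
AROMATIC = frozenset("FYW")     # phenylalanine, tyrosine, tryptophan


def pocket1_accepts(beta86, side_chain):
    """Return True if a Pocket 1 with the given beta-86 residue can hold the side chain."""
    if beta86 == "G":           # glycine: aliphatic and aromatic side chains both fit
        return side_chain in ALIPHATIC | AROMATIC
    if beta86 == "V":           # valine: only aliphatic side chains fit
        return side_chain in ALIPHATIC
    raise ValueError("only G or V at position 86 is modelled in this sketch")


def relative_weight(side_chain, pocket_types=("G", "V")):
    """Count how many of the (assumed equally frequent) pocket types accept the side chain."""
    return sum(pocket1_accepts(p, side_chain) for p in pocket_types)


if __name__ == "__main__":
    for aa in "LFK":
        print(aa, relative_weight(aa))
```

Under these assumptions an aliphatic residue is accepted by both pocket types and an aromatic residue by one, which is the origin of the approximately two-to-one weighting.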

A computational method embodying the present invention profiles the likelihood of
peptide regions to contain T-cell epitopes as follows:

(1) The primary sequence of a peptide segment of predetermined length is scanned, and
all hydrophobic aliphatic and aromatic side chains present are identified.

(2) The hydrophobic aliphatic side chains are assigned a value greater than that for the
aromatic side chains; preferably about twice the value assigned to the aromatic side
chains, e.g., a value of 2 for a hydrophobic aliphatic side chain and a value of 1 for an
aromatic side chain.

(3) The values determined to be present are summed for each overlapping amino
acid residue segment (window) of predetermined uniform length within the peptide, and
the total value for a particular segment (window) is assigned to a single amino acid
residue at an intermediate position of the segment (window), preferably to a residue at
about the midpoint of the sampled segment (window). This procedure is repeated for
each sampled overlapping amino acid residue segment (window). Thus, each amino acid
residue of the peptide is assigned a value that relates to the likelihood of a T-cell epitope
being present in that particular segment (window).

(4) The values calculated and assigned as described in Step 3, above, can be plotted
against the amino acid coordinates of the entire amino acid residue sequence being
assessed.

(5) All portions of the sequence which have a score of a predetermined value, e.g., a
value of 1, are deemed likely to contain a T-cell epitope and can be modified, if desired.

This particular aspect of the present invention provides a general method by which the
regions of peptides likely to contain T-cell epitopes can be described. Modifications to
the peptide in these regions have the potential to modify the MHC Class II binding
characteristics.
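
Steps (1) to (5) can be sketched as a sliding-window sum. This is an illustration only: the window length of 9, the "at or above threshold" comparison, and the names `residue_value`, `epitope_profile` and `flagged_positions` are assumptions, since the text specifies only a "predetermined" window length and score.

```python
ALIPHATIC = set("VLIM")   # hydrophobic aliphatic side chains, value 2
AROMATIC = set("FYW")     # aromatic side chains, value 1


def residue_value(aa, aliphatic_value=2, aromatic_value=1):
    """Steps (1)-(2): score one residue by side-chain class."""
    if aa in ALIPHATIC:
        return aliphatic_value
    if aa in AROMATIC:
        return aromatic_value
    return 0


def epitope_profile(sequence, window=9):
    """Step (3): sum values over each overlapping window and assign the
    total to the residue at about the midpoint of that window.
    Returns {zero-based midpoint index: window total}."""
    values = [residue_value(aa) for aa in sequence]
    profile = {}
    for start in range(len(sequence) - window + 1):
        midpoint = start + window // 2
        profile[midpoint] = sum(values[start:start + window])
    return profile


def flagged_positions(profile, threshold=1):
    """Step (5): positions whose window total reaches the chosen threshold."""
    return [pos for pos, total in sorted(profile.items()) if total >= threshold]


if __name__ == "__main__":
    toy = "AVLAAFAAA"                       # toy sequence, not KGF
    prof = epitope_profile(toy, window=3)   # short window for readability
    print(prof)
    print(flagged_positions(prof))
```

The profile returned by `epitope_profile` corresponds to the values that step (4) plots against the sequence coordinates.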

According to another aspect of the present invention, T-cell epitopes can be predicted
with greater accuracy by the use of a more sophisticated computational method which
takes into account the interactions of peptides with models of MHC Class II alleles. The
computational prediction of T-cell epitopes present within a peptide according to this
particular aspect contemplates the construction of models of at least 42 MHC Class II
alleles based upon the structures of all known MHC Class II molecules and a method for
the use of these models in the computational identification of T-cell epitopes, the
construction of libraries of peptide backbones for each model in order to allow for the
known variability in relative peptide backbone alpha carbon (Cα) positions, the
construction of libraries of amino-acid side chain conformations for each backbone docked
with each model for each of the 20 amino-acid alternatives at positions critical for the
interaction between peptide and MHC Class II molecule, and the use of these libraries of
backbones and side-chain conformations in conjunction with a scoring function to select
the optimum backbone and side-chain conformation for a particular peptide docked with a
particular MHC Class II molecule and the derivation of a binding score from this
interaction.

Models of MHC Class II molecules can be derived via homology modeling from a
number of similar structures found in the Brookhaven Protein Data Bank ("PDB"). These
may be made by the use of semi-automatic homology modeling software (Modeller, Sali
A. & Blundell T.L., 1993. J. Mol. Biol. 234:779-815) which incorporates a simulated
annealing function, in conjunction with the CHARMm force-field for energy
minimisation (available from Molecular Simulations Inc., San Diego, Ca.). Alternative
modeling methods can be utilized as well.

The present method differs significantly from other computational methods which use
libraries of experimentally derived binding data of each amino-acid alternative at each
position in the binding groove for a small set of MHC Class II molecules (Marshall,
K.W., et al., Biomed. Pept. Proteins Nucleic Acids, 1(3):157-162 (1995)) or yet other
computational methods which use similar experimental binding data in order to define the
binding characteristics of particular types of binding pockets within the groove, again
using a relatively small subset of MHC Class II molecules, and then 'mixing and
matching' pocket types from this pocket library to artificially create further 'virtual'
MHC Class II molecules (Sturniolo T., et al., Nat. Biotech., 17(6):555-561 (1999)). Both
prior methods suffer the major disadvantage that, due to the complexity of the assays and
the need to synthesize large numbers of peptide variants, only a small number of MHC
Class II molecules can be experimentally scanned. Therefore the first prior method can
only make predictions for a small number of MHC Class II molecules. The second prior
method also makes the assumption that a pocket lined with similar amino-acids in one
molecule will have the same binding characteristics when in the context of a different
Class II allele and suffers further disadvantages in that only those MHC Class II
molecules can be 'virtually' created which contain pockets contained within the pocket
library. Using the modeling approach described herein, the structure of any number and
type of MHC Class II molecules can be deduced, therefore alleles can be specifically
selected to be representative of the global population. In addition, the number of MHC
Class II molecules scanned can be increased by making further models rather than
having to generate additional data via complex experimentation.

The use of a backbone library allows for variation in the positions of the Cα atoms of the
various peptides being scanned when docked with particular MHC Class II molecules.
This is again in contrast to the alternative prior computational methods described above
which rely on the use of simplified peptide backbones for scanning amino-acid binding in
particular pockets. These simplified backbones are not likely to be representative of
backbone conformations found in 'real' peptides, leading to inaccuracies in prediction of
peptide binding. The present backbone library is created by superposing the backbones of
all peptides bound to MHC Class II molecules found within the Protein Data Bank and
noting the root mean square (RMS) deviation between the Cα atoms of each of the eleven
amino-acids located within the binding groove. While this library can be derived from a
small number of suitable available mouse and human structures (currently 13), in order to
allow for the possibility of even greater variability, the RMS figure for each Cα
position is increased by 50%. The average Cα position of each amino-acid is then
determined and a sphere drawn around this point whose radius equals the RMS deviation
at that position plus 50%. This sphere represents all allowed Cα positions. Working from
the Cα with the least RMS deviation (that of the amino-acid in Pocket 1 as mentioned
above, equivalent to Position 2 of the 11 residues in the binding groove), the sphere is
three-dimensionally gridded, and each vertex within the grid is then used as a possible
location for a Cα of that amino-acid. The subsequent amide plane, corresponding to the
peptide bond to the subsequent amino-acid, is grafted onto each of these Cαs and the φ
and ψ angles are rotated step-wise at set intervals in order to position the subsequent Cα.
If the subsequent Cα falls within the 'sphere of allowed positions' for this Cα then the
orientation of the dipeptide is accepted, whereas if it falls outside the sphere then the
dipeptide is rejected. This process is then repeated for each of the subsequent Cα
positions, such that the peptide grows from the Pocket 1 Cα 'seed', until all nine
subsequent Cαs have been positioned from all possible permutations of the preceding
Cαs. The process is then repeated once more for the single Cα preceding Pocket 1 to
create a library of backbone Cα positions located within the binding groove. The number
of backbones generated is dependent upon several factors: the size of the 'spheres of
allowed positions'; the fineness of the gridding of the 'primary sphere' at the Pocket 1
position; the fineness of the step-wise rotation of the φ and ψ angles used to position
subsequent Cαs. Using this process, a large library of backbones can be created. The
larger the backbone library, the more likely it will be that the optimum fit will be found
for a particular peptide within the binding groove of an MHC Class II molecule.
Inasmuch as not all backbones will be suitable for docking with all the models of MHC
Class II molecules due to clashes with amino-acids of the binding domains, for each allele
a subset of the library is created comprising backbones which can be accommodated by
that allele. The use of the backbone library, in conjunction with the models of MHC Class
II molecules, creates an exhaustive database consisting of allowed side chain
conformations for each amino-acid in each position of the binding groove for each MHC
Class II molecule docked with each allowed backbone. This data set is generated using a
simple steric overlap function where an MHC Class II molecule is docked with a backbone
and an amino-acid side chain is grafted onto the backbone at the desired position. Each of
the rotatable bonds of the side chain is rotated step-wise at set intervals and the resultant
positions of the atoms dependent upon that bond noted. The interaction of the atom with
atoms of side-chains of the binding groove is noted and positions are either accepted or
rejected according to the following criterion: the sum total of the overlap of all atoms so
far positioned must not exceed a pre-determined value. Thus the stringency of the
conformational search is a function of the interval used in the step-wise rotation of the
bond and the pre-determined limit for the total overlap. This latter value can be small if it
is known that a particular pocket is rigid; however, the stringency can be relaxed if the
positions of pocket side-chains are known to be relatively flexible. Thus allowances can
be made to imitate variations in flexibility within pockets of the binding groove. This
conformational search is then repeated for every amino-acid at every position of each
backbone when docked with each of the MHC Class II molecules to create the exhaustive
database of side-chain conformations.
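
The 'sphere of allowed positions' test used while growing a backbone can be sketched as follows. This is a simplified illustration: the RMS deviation is passed in as a number rather than derived from superposed structures, the coordinates are toy values, and the names `allowed_sphere` and `is_allowed` are hypothetical.

```python
import math


def allowed_sphere(ca_positions, rms_deviation, margin=0.5):
    """Build the sphere of allowed Cα positions for one groove position.

    The centre is the average of the observed Cα coordinates and the
    radius is the RMS deviation at that position plus 50% (margin=0.5).
    """
    n = len(ca_positions)
    centre = tuple(sum(p[i] for p in ca_positions) / n for i in range(3))
    radius = rms_deviation * (1.0 + margin)
    return centre, radius


def is_allowed(candidate, centre, radius):
    """Accept a candidate Cα only if it lies within the sphere."""
    return math.dist(candidate, centre) <= radius


if __name__ == "__main__":
    # Toy coordinates in Å, not taken from any solved structure.
    centre, radius = allowed_sphere([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], rms_deviation=1.0)
    print(centre, radius)
    print(is_allowed((2.4, 0.0, 0.0), centre, radius))
    print(is_allowed((2.6, 0.0, 0.0), centre, radius))
```

In the procedure described above, a check of this kind would be applied to each subsequent Cα generated by the step-wise φ and ψ rotations, with rejected dipeptide orientations discarded.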

A suitable mathematical expression is used to estimate the energy of binding between
models of MHC Class II molecules in conjunction with peptide ligand conformations
which have to be empirically derived by scanning the large database of backbone/side-
chain conformations described above. Thus a protein is scanned for potential T-cell
epitopes by subjecting each possible peptide of length varying between 9 and 20 amino-
acids (although the length is kept constant for each scan) to the following computations:
an MHC Class II molecule is selected together with a peptide backbone allowed for that
molecule and the side-chains corresponding to the desired peptide sequence are grafted
on. Atom identity and interatomic distance data relating to a particular side-chain at a
particular position on the backbone are collected for each allowed conformation of that
amino-acid (obtained from the database described above). This is repeated for each side-
chain along the backbone and peptide scores derived using a scoring function. The best
score for that backbone is retained and the process repeated for each allowed backbone
for the selected model. The scores from all allowed backbones are compared and the
highest score is deemed to be the peptide score for the desired peptide in that MHC Class
II model. This process is then repeated for each model with every possible peptide
derived from the protein being scanned, and the scores for peptides versus models are
displayed.
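
The scanning loop described above can be outlined as a short sketch. It assumes a caller-supplied `score_fn(peptide, model_name, backbone)` that already returns the best score over side-chain conformations for one backbone; that callback, the data layout and the function names are hypothetical placeholders, not the scoring function of the invention.

```python
def candidate_peptides(protein, length):
    """Every possible peptide of one fixed length from the protein sequence."""
    return [protein[i:i + length] for i in range(len(protein) - length + 1)]


def peptide_score(peptide, model_name, backbones, score_fn):
    """Highest score over all backbones allowed for the selected model."""
    return max(score_fn(peptide, model_name, backbone) for backbone in backbones)


def scan(protein, length, models, score_fn):
    """Score every peptide against every model.

    models maps a model name to the list of backbones allowed for it.
    Returns {(peptide, model_name): peptide score}.
    """
    results = {}
    for peptide in candidate_peptides(protein, length):
        for model_name, backbones in models.items():
            results[(peptide, model_name)] = peptide_score(
                peptide, model_name, backbones, score_fn
            )
    return results


if __name__ == "__main__":
    # Toy scoring callback and toy "backbones" (plain numbers), for illustration only.
    def toy_score(peptide, model_name, backbone):
        return peptide.count("L") * backbone

    table = scan("ALLA", 3, {"model_1": [1, 2], "model_2": [3]}, toy_score)
    for key, value in sorted(table.items()):
        print(key, value)
```

In practice the text calls for peptide lengths between 9 and 20 residues, held constant within a scan; the length of 3 in the usage example is only to keep the toy output short.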

In the context of the present invention, each ligand presented for the binding affinity
calculation is an amino-acid segment selected from a peptide or protein as discussed
above. Thus, the ligand is a selected stretch of amino acids about 9 to 20 amino acids in
length derived from a peptide, polypeptide or protein of known sequence. The terms
"amino acids" and "residues" are hereinafter regarded as equivalent terms. The ligand, in
the form of the consecutive amino acids of the peptide to be examined grafted onto a
backbone from the backbone library, is positioned in the binding cleft of an MHC Class II
molecule from the MHC Class II molecule model library via the coordinates of the Cα
atoms of the peptide backbone and an allowed conformation for each side-chain is
selected from the database of allowed conformations. The relevant atom identities and
interatomic distances are also retrieved from this database and used to calculate the
peptide binding score. Ligands with a high binding affinity for the MHC Class II binding
pocket are flagged as candidates for site-directed mutagenesis. Amino-acid substitutions
are made in the flagged ligand (and hence in the protein of interest) which is then retested
using the scoring function in order to determine changes which reduce the binding affinity
below a predetermined threshold value. These changes can then be incorporated into the
protein of interest to remove T-cell epitopes.

Binding between the peptide ligand and the
binding groove of MHC Class II molecules involves non-covalent interactions including,
but not limited to: hydrogen bonds, electrostatic interactions, hydrophobic (lipophilic)
interactions and Van der Waals interactions. These are included in the peptide scoring
function as described in detail below.

It should be understood that a hydrogen bond is a
non-covalent bond which can be formed between polar or charged groups and consists of
a hydrogen atom shared by two other atoms. The hydrogen of the hydrogen donor has a
positive charge whereas the hydrogen acceptor has a partial negative charge. For the
purposes of peptide/protein interactions, hydrogen bond donors may be either nitrogens
with hydrogen attached or hydrogens attached to oxygen or nitrogen. Hydrogen bond
acceptor atoms may be oxygens not attached to hydrogen, nitrogens with no hydrogens
attached and one or two connections, or sulphurs with only one connection. Certain
atoms, such as oxygens attached to hydrogens or imine nitrogens (e.g. C=NH), may be
both hydrogen acceptors and donors. Hydrogen bond energies range from 3 to 7 kcal/mol
and are much stronger than Van der Waals bonds, but weaker than covalent bonds.
Hydrogen bonds are also highly directional and are at their strongest when the donor
atom, hydrogen atom and acceptor atom are co-linear.

Electrostatic bonds are formed
between oppositely charged ion pairs and the strength of the interaction is inversely
proportional to the square of the distance between the atoms according to Coulomb's law.
The optimal distance between ion pairs is about 2.8 Å. In protein/peptide interactions,
electrostatic bonds may be formed between arginine, histidine or lysine and aspartate or
glutamate. The strength of the bond will depend upon the pKa of the ionizing group and
the dielectric constant of the medium, although they are approximately similar in strength
to hydrogen bonds.

Lipophilic interactions are favorable hydrophobic-hydrophobic
contacts that occur between the protein and peptide ligand. Usually, these will occur
between hydrophobic amino acid side chains of the peptide buried within the pockets of
the binding groove such that they are not exposed to solvent. Exposure of the
hydrophobic residues to solvent is highly unfavorable since the surrounding solvent
molecules are forced to hydrogen bond with each other forming cage-like clathrate
structures. The resultant decrease in entropy is highly unfavorable. Lipophilic atoms may
be sulphurs which are neither polar nor hydrogen acceptors and carbon atoms which are
not polar.

Van der Waals bonds are non-specific forces found between atoms which are
3-4 Å apart. They are weaker and less specific than hydrogen and electrostatic bonds.
The distribution of electronic charge around an atom changes with time and, at any
instant, the charge distribution is not symmetric. This transient asymmetry in electronic
charge induces a similar asymmetry in neighboring atoms. The resultant attractive forces
between atoms reach a maximum at the Van der Waals contact distance but diminish
very rapidly at about 1 Å to about 2 Å. Conversely, as atoms become separated by less
than the contact distance, increasingly strong repulsive forces become dominant as the
outer electron clouds of the atoms overlap. Although the attractive forces are relatively
weak compared to electrostatic and hydrogen bonds (about 0.6 kcal/mol), the repulsive
forces in particular may be very important in determining whether a peptide ligand may
bind successfully to a protein.

In one embodiment, the Böhm scoring function (SCORE1 approach) is used to estimate
the binding constant (Böhm, H.J., J. Comput. Aided Mol. Des., 8(3):243-256 (1994),
which is hereby incorporated in its entirety). In another embodiment, the scoring function
(SCORE2 approach) is used to estimate the binding affinities as an indicator of a ligand
containing a T-cell epitope (Böhm, H.J., J. Comput. Aided Mol. Des., 12(4):309-323
(1998), which is hereby incorporated in its entirety). However, the Böhm scoring
functions as described in the above references are used to estimate the binding affinity of
a ligand to a protein where it is already known that the ligand successfully binds to the
protein and the protein/ligand complex has had its structure solved, the solved structure
being present in the Protein Data Bank ("PDB"). Therefore, the scoring function has
been developed with the benefit of known positive binding data. In order to allow for
discrimination between positive and negative binders, a repulsion term must be added to
the equation. In addition, a more satisfactory estimate of binding energy is achieved by
computing the lipophilic interactions in a pairwise manner rather than using the area
based energy term of the above Böhm functions. Therefore, in a preferred embodiment,
the binding energy is estimated using a modified Böhm scoring function. In the modified
Böhm scoring function, the binding energy between protein and ligand ($\Delta G_{bind}$) is
estimated considering the following parameters: the reduction of binding energy due to
the overall loss of translational and rotational entropy of the ligand ($\Delta G_0$); contributions
from ideal hydrogen bonds ($\Delta G_{hb}$) where at least one partner is neutral; contributions
from unperturbed ionic interactions ($\Delta G_{ionic}$); lipophilic interactions between lipophilic
ligand atoms and lipophilic acceptor atoms ($\Delta G_{lipo}$); the loss of binding energy due to the
freezing of internal degrees of freedom in the ligand, i.e., the freedom of rotation about
each C-C bond is reduced ($\Delta G_{rot}$); the energy of the interaction between the protein and
ligand ($E_{VdW}$). Consideration of these terms gives equation 1:

$$\Delta G_{bind} = (\Delta G_0) + (\Delta G_{hb} \times N_{hb}) + (\Delta G_{ionic} \times N_{ionic}) + (\Delta G_{lipo} \times N_{lipo}) + (\Delta G_{rot} + N_{rot}) + (E_{VdW})$$

where $N$ is the number of qualifying interactions for a specific term and, in one
embodiment, $\Delta G_0$, $\Delta G_{hb}$, $\Delta G_{ionic}$, $\Delta G_{lipo}$ and $\Delta G_{rot}$ are constants which are given the
values: 5.4, -4.7, -4.7, -0.17, and 1.4, respectively.
The term $N_{hb}$ is calculated according to equation 2:

$$N_{hb} = \sum_{h\text{-}bonds} f(\Delta R, \Delta\alpha) \times f(N_{neighb}) \times f_{pcs}$$

$f(\Delta R, \Delta\alpha)$ is a penalty function which accounts for large deviations of hydrogen bonds
from ideality and is calculated according to equation 3:

$$f(\Delta R, \Delta\alpha) = f_1(\Delta R) \times f_2(\Delta\alpha)$$

where:

$$f_1(\Delta R) = \begin{cases} 1 & \text{if } \Delta R \le TOL \\ 1 - (\Delta R - TOL)/0.4 & \text{if } \Delta R \le 0.4 + TOL \\ 0 & \text{if } \Delta R > 0.4 + TOL \end{cases}$$

and:

$$f_2(\Delta\alpha) = \begin{cases} 1 & \text{if } \Delta\alpha < 30^\circ \\ 1 - (\Delta\alpha - 30)/50 & \text{if } \Delta\alpha \le 80^\circ \\ 0 & \text{if } \Delta\alpha > 80^\circ \end{cases}$$

$TOL$ is the tolerated deviation in hydrogen bond length = 0.25 Å
$\Delta R$ is the deviation of the H-O/N hydrogen bond length from the ideal value = 1.9 Å
$\Delta\alpha$ is the deviation of the hydrogen bond angle $\angle_{N/O-H \cdots O/N}$ from its idealized value of
180°
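
The penalty function of equation 3 can be written out as a small sketch using the constants given above. Taking the deviations as absolute differences from the ideal length and angle is an assumption of this sketch, as are the function names.

```python
TOL = 0.25            # tolerated deviation in hydrogen bond length, Å
IDEAL_LENGTH = 1.9    # ideal H-O/N hydrogen bond length, Å
IDEAL_ANGLE = 180.0   # idealized hydrogen bond angle, degrees


def f1(delta_r):
    """Length term of equation 3."""
    if delta_r <= TOL:
        return 1.0
    if delta_r <= 0.4 + TOL:
        return 1.0 - (delta_r - TOL) / 0.4
    return 0.0


def f2(delta_alpha):
    """Angle term of equation 3."""
    if delta_alpha < 30.0:
        return 1.0
    if delta_alpha <= 80.0:
        return 1.0 - (delta_alpha - 30.0) / 50.0
    return 0.0


def hbond_penalty(length, angle):
    """f(ΔR, Δα) = f1(ΔR) x f2(Δα), with deviations taken as absolute values (assumption)."""
    delta_r = abs(length - IDEAL_LENGTH)
    delta_alpha = abs(angle - IDEAL_ANGLE)
    return f1(delta_r) * f2(delta_alpha)


if __name__ == "__main__":
    print(hbond_penalty(1.9, 180.0))    # ideal geometry
    print(hbond_penalty(2.35, 180.0))   # stretched bond
    print(hbond_penalty(1.9, 125.0))    # bent bond
```

An ideal hydrogen bond therefore contributes a full count of 1 to the sum in equation 2, while a strongly distorted one contributes nothing.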

$f(N_{neighb})$ distinguishes between concave and convex parts of a protein surface and
therefore assigns greater weight to polar interactions found in pockets rather than those
found at the protein surface. This function is calculated according to equation 4 below:

$$f(N_{neighb}) = (N_{neighb}/N_{neighb,0})^{\alpha} \quad \text{where } \alpha = 0.5$$

$N_{neighb}$ is the number of non-hydrogen protein atoms that are closer than 5 Å to any given
protein atom.

$N_{neighb,0}$ is a constant = 25

$f_{pcs}$ is a function which allows for the polar contact surface area per hydrogen bond and
therefore distinguishes between strong and weak hydrogen bonds and its value is
determined according to the following criteria:

$$f_{pcs} = \beta \quad \text{when } A_{polar}/N_{HB} < 10\ \text{Å}^2$$

or

$$f_{pcs} = 1 \quad \text{when } A_{polar}/N_{HB} > 10\ \text{Å}^2$$

$A_{polar}$ is the size of the polar protein-ligand contact surface
$N_{HB}$ is the number of hydrogen bonds
$\beta$ is a constant whose value = 1.2
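
The two weighting factors that multiply the hydrogen bond penalty in equation 2 can be sketched with the constants given above. The text does not state the value of $f_{pcs}$ when the ratio is exactly 10 Å²; this sketch assigns 1 in that case as an assumption, and the function names are illustrative.

```python
N_NEIGHB_0 = 25   # constant from equation 4
ALPHA = 0.5       # exponent from equation 4
BETA = 1.2        # constant used by f_pcs


def f_neighb(n_neighb):
    """Equation 4: weight buried (pocket) polar contacts above surface ones."""
    return (n_neighb / N_NEIGHB_0) ** ALPHA


def f_pcs(a_polar, n_hb):
    """Polar contact surface factor.

    Returns BETA when the polar contact area per hydrogen bond is below
    10 Å^2 and 1 otherwise (the boundary case is an assumption).
    """
    return BETA if a_polar / n_hb < 10.0 else 1.0


if __name__ == "__main__":
    print(f_neighb(25), f_neighb(100))
    print(f_pcs(15.0, 2), f_pcs(50.0, 2))
```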

For the implementation of the modified Böhm scoring function, the contributions from
ionic interactions, $\Delta G_{ionic}$, are computed in a similar fashion to those from hydrogen
bonds described above since the same geometry dependency is assumed.
The term $N_{lipo}$ is calculated according to equation 5 below:

$$N_{lipo} = \sum_{lL} f(r_{lL})$$

$f(r_{lL})$ is calculated for all lipophilic ligand atoms, $l$, and all lipophilic protein atoms, $L$,
according to the following criteria:






WO 02/062842 - 26 - PCT/EP02/01175 

$$f(r_{lL}) = \begin{cases} 1 & \text{when } r_{lL} \le R1 \\ (r_{lL} - R1)/(R2 - R1) & \text{when } R1 < r_{lL} < R2 \\ 0 & \text{when } r_{lL} \ge R2 \end{cases}$$

where: $R1 = r_l^{vdw} + r_L^{vdw} + 0.5$
and $R2 = R1 + 3.0$
and $r_l^{vdw}$ is the Van der Waals radius of atom $l$
and $r_L^{vdw}$ is the Van der Waals radius of atom $L$

The term $N_{rot}$ is the number of rotatable bonds of the amino acid side chain and is taken to
be the number of acyclic sp3-sp3 and sp3-sp2 bonds. Rotations of terminal -CH3 or
-NH3 are not taken into account.

The final term, $E_{VdW}$, is calculated according to equation 6 below:

$$E_{VdW} = \varepsilon_1 \varepsilon_2 \left( \frac{(r_1^{vdw} + r_2^{vdw})^{12}}{r^{12}} - \frac{(r_1^{vdw} + r_2^{vdw})^{6}}{r^{6}} \right)$$

where:
$\varepsilon_1$ and $\varepsilon_2$ are constants dependent upon atom identity
$r_1^{vdw}$ and $r_2^{vdw}$ are the Van der Waals atomic radii
$r$ is the distance between a pair of atoms.

With regard to equation 6, in one embodiment, the constants $\varepsilon_1$ and $\varepsilon_2$ are given the atom
values: C: 0.245, N: 0.283, O: 0.316, S: 0.316, respectively (i.e. for atoms of Carbon,
Nitrogen, Oxygen and Sulphur, respectively). With regard to equations 5 and 6, the Van
der Waals radii are given the atom values C: 1.85, N: 1.75, O: 1.60, S: 2.00 Å.
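
Equation 6 for a single atom pair can be evaluated with the constants listed above, as in the following sketch. The dictionary layout and the function name `e_vdw` are illustrative assumptions; how pair energies are accumulated over the whole complex is not shown here.

```python
# Atom-type constants from the text (one embodiment).
EPSILON = {"C": 0.245, "N": 0.283, "O": 0.316, "S": 0.316}
RADIUS = {"C": 1.85, "N": 1.75, "O": 1.60, "S": 2.00}   # Van der Waals radii, Å


def e_vdw(atom1, atom2, r):
    """Equation 6 for one pair of atoms separated by distance r (Å)."""
    contact = RADIUS[atom1] + RADIUS[atom2]
    ratio = contact / r
    return EPSILON[atom1] * EPSILON[atom2] * (ratio ** 12 - ratio ** 6)


if __name__ == "__main__":
    print(e_vdw("C", "C", 3.7))   # at the summed radii the two terms cancel
    print(e_vdw("C", "O", 3.0))   # closer than the summed radii: repulsive
    print(e_vdw("C", "O", 5.0))   # further apart: weakly attractive
```

The sign change around the summed radii is what supplies the repulsion term needed to discriminate negative binders.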
It should be understood that all predetermined values and constants given in the equations
above are determined within the constraints of current understandings of protein ligand
interactions with particular regard to the type of computation being undertaken herein.
Therefore, it is possible that, as this scoring function is refined further, these values and
constants may change; hence any suitable numerical value which gives the desired results
in terms of estimating the binding energy of a protein to a ligand may be used and hence
falls within the scope of the present invention.

As described above, the scoring function is applied to data extracted from the database of
side-chain conformations, atom identities, and interatomic distances. For the purposes of
the present description, the number of MHC Class II molecules included in this database
is 42 models plus four solved structures. It should be apparent from the above
descriptions that the modular nature of the construction of the computational method of
the present invention means that new models can simply be added and scanned with the
peptide backbone library and side-chain conformational search function to create
additional data sets which can be processed by the peptide scoring function as described
above. This allows for the repertoire of scanned MHC Class II molecules to easily be
increased, or structures and associated data to be replaced if data are available to create
more accurate models of the existing alleles.

The present prediction method can be calibrated against a data set comprising a large
number of peptides whose affinity for various MHC Class II molecules has previously
been experimentally determined. By comparison of calculated versus experimental data,
a cut-off value can be determined above which it is known that all experimentally
determined T-cell epitopes are correctly predicted.
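
One simple reading of this calibration step is sketched below: the cut-off is placed at the lowest calculated score among the experimentally confirmed epitopes, so that every confirmed epitope scores at or above it. This interpretation, the function names and the toy numbers are assumptions for illustration, not data from the invention.

```python
def calibrate_cutoff(scored_peptides):
    """Return a cut-off at or above which every known epitope falls.

    scored_peptides is a list of (calculated_score, is_experimental_epitope).
    """
    epitope_scores = [score for score, is_epitope in scored_peptides if is_epitope]
    if not epitope_scores:
        raise ValueError("calibration needs at least one experimental epitope")
    return min(epitope_scores)


def predicted_epitopes(scores, cutoff):
    """Names of peptides whose calculated score reaches the cut-off."""
    return sorted(name for name, score in scores.items() if score >= cutoff)


if __name__ == "__main__":
    # Toy calibration set: (calculated score, confirmed epitope?)
    calibration = [(8.1, True), (6.4, True), (5.9, False), (7.2, True), (3.0, False)]
    cutoff = calibrate_cutoff(calibration)
    print(cutoff)
    print(predicted_epitopes({"pep_a": 6.9, "pep_b": 4.2, "pep_c": 6.4}, cutoff))
```

A cut-off chosen this way favours sensitivity: non-epitopes that happen to score above it would be flagged as well.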

It should be understood that, although the above scoring function is relatively simple
compared to some sophisticated methodologies that are available, the calculations are
performed extremely rapidly. It should also be understood that the objective is not to
calculate the true binding energy per se for each peptide docked in the binding groove of
a selected MHC Class II protein. The underlying objective is to obtain comparative
binding energy data as an aid to predicting the location of T-cell epitopes based on the
primary structure (i.e. amino acid sequence) of a selected protein. A relatively high
binding energy or a binding energy above a selected threshold value would suggest the
presence of a T-cell epitope in the ligand. The ligand may then be subjected to at least
one round of amino-acid substitution and the binding energy recalculated. Due to the
rapid nature of the calculations, these manipulations of the peptide sequence can be
performed interactively within the program's user interface on cost-effectively available
computer hardware. Major investment in computer hardware is thus not required.

It would be apparent to one skilled in the art that other available software could be used
for the same purposes. In particular, more sophisticated software which is capable of
docking ligands into protein binding-sites may be used in conjunction with energy
minimization. Examples of docking software are: DOCK (Kuntz et al., J. Mol. Biol.,
161:269-288 (1982)), LUDI (Böhm, H.J., J. Comput. Aided Mol. Des., 8:623-632 (1994))
and FLEXX (Rarey M., et al., ISMB, 3:300-308 (1995)). Examples of molecular
modeling and manipulation software include: AMBER (Tripos) and CHARMm
(Molecular Simulations Inc.). The use of these computational methods would severely
limit the throughput of the method of this invention due to the lengths of processing time
required to make the necessary calculations. However, it is feasible that such methods
could be used as a 'secondary screen' to obtain more accurate calculations of binding
energy for peptides which are found to be 'positive binders' via the method of the present
invention. The limitation of processing time for sophisticated molecular mechanic or
molecular dynamic calculations is one which is defined both by the design of the software
which makes these calculations and the current technology limitations of computer
hardware. It may be anticipated that, in the future, with the writing of more efficient code
and the continuing increases in speed of computer processors, it may become feasible to
make such calculations within a more manageable time-frame. Further information on
energy functions applied to macromolecules and consideration of the various interactions
that take place within a folded protein structure can be found in: Brooks, B.R., et al.,
J. Comput. Chem., 4:187-217 (1983) and further information concerning general
protein-ligand interactions can be found in: Dauber-Osguthorpe et al., Proteins
4(1):31-47 (1988), which are incorporated herein by reference in their entirety. Useful
background information can also be found, for example, in Fasman, G.D., ed., Prediction
of Protein Structure and the Principles of Protein Conformation, Plenum Press, New York,
ISBN: 0-306-43131-9.

Patent Claims

1. A modified molecule having the biological activity of keratinocyte growth factor
(KGF) and being substantially non-immunogenic or less immunogenic than any 
non-modified molecule having the same biological activity when used in vivo. 

2. A molecule according to claim 1, wherein said loss of immunogenicity is achieved 
by removing one or more T-cell epitopes derived from the originally non-modified 
molecule. 

3. A molecule according to claim 1 or 2, wherein said loss of immunogenicity is 
achieved by reduction in numbers of MHC allotypes able to bind peptides derived 
from said molecule. 

4. A molecule according to claim 2 or 3, wherein one T-cell epitope is removed. 

5. A molecule according to any of the claims 2-4, wherein said originally present T- 
cell epitopes are MHC class II ligands or peptide sequences which show the ability 
to stimulate or bind T-cells via presentation on class II. 

6. A molecule according to claim 5, wherein said peptide sequences are selected from 
the group as depicted in Table 1. 

7. A molecule according to any of the claims 2-6, wherein 1-9 amino acid residues 
in any of the originally present T-cell epitopes are altered. 

8. A molecule according to claim 7, wherein one amino acid residue is altered. 

9. A molecule according to claim 7 or 8, wherein the alteration of the amino acid 
residues is substitution of originally present amino acid(s) residue(s) by other amino 
acid residue(s) at specific position(s). 

10. A molecule according to claim 9, wherein one or more of the amino acid residue
substitutions are carried out as indicated in Table 2.

11. A molecule according to claim 10, wherein additionally one or more of the amino
acid residue substitutions are carried out as indicated in Table 3 for the reduction in
the number of MHC allotypes able to bind peptides derived from said molecule.

12. A molecule according to claim 9, wherein one or more amino acid substitutions are
carried out as indicated in Table 3.

13. A molecule according to claim 7 or 8, wherein the alteration of the amino acid
residues is deletion of originally present amino acid(s) residue(s) at specific
position(s).

14. A molecule according to claim 7 or 8, wherein the alteration of the amino acid
residues is addition of amino acid(s) at specific position(s) to those originally
present.

15. A molecule according to any of the claims 7 to 14, wherein additionally further
alteration is conducted to restore biological activity of said molecule.

16. A molecule according to claim 15, wherein the additional further alteration is
substitution, addition or deletion of specific amino acid(s).

17. A modified molecule according to any of the claims 7-16, wherein the amino acid
alteration is made with reference to a homologous protein sequence.

18. A modified molecule according to any of the claims 7-16, wherein the amino acid
alteration is made with reference to in silico modeling techniques.

19. A DNA sequence coding for a modified keratinocyte growth factor (KGF) of any of
the claims 1-18.

20. A pharmaceutical composition comprising a modified molecule having the
biological activity of keratinocyte growth factor (KGF) as defined in any of the
above-cited claims, optionally together with a pharmaceutically acceptable carrier,
diluent or excipient.

21. A method for manufacturing a modified molecule having the biological activity of
keratinocyte growth factor (KGF) as defined in any of the above-cited claims,
comprising the following steps:

(i) determining the amino acid sequence of the polypeptide or part thereof;

(ii) identifying one or more potential T-cell epitopes within the amino acid sequence
of the protein by any method including determination of the binding of the peptides
to MHC molecules using in vitro or in silico techniques or biological assays;

(iii) designing new sequence variants with one or more amino acids within the
identified potential T-cell epitopes modified in such a way to substantially reduce or
eliminate the activity of the T-cell epitope as determined by the binding of the
peptides to MHC molecules using in vitro or in silico techniques or biological
assays, or by binding of peptide-MHC complexes to T-cells;

(iv) constructing such sequence variants by recombinant DNA techniques and
testing said variants in order to identify one or more variants with desirable
properties; and

(v) optionally repeating steps (ii) - (iv).

22. A method of claim 21, wherein step (iii) is carried out by substitution, addition or
deletion of 1 - 9 amino acid residues in any of the originally present T-cell epitopes.

23. A method of claim 22, wherein the alteration is made with reference to a
homologous protein sequence and/or in silico modeling techniques.

24. A method of any of the claims 21-23, wherein step (ii) is carried out by the
following steps: (a) selecting a region of the peptide having a known amino acid
residue sequence; (b) sequentially sampling overlapping amino acid residue
segments of predetermined uniform size and constituted by at least three amino acid
residues from the selected region; (c) calculating MHC Class II molecule binding
score for each said sampled segment by summing assigned values for each
hydrophobic amino acid residue side chain present in said sampled amino acid
residue segment; and (d) identifying at least one of said segments suitable for
modification, based on the calculated MHC Class II molecule binding score for that
segment, to change overall MHC Class II binding score for the peptide without
substantially reducing therapeutic utility of the peptide.

25. A method of claim 24, wherein step (c) is carried out by using a Böhm scoring
function modified to include 12-6 van der Waals ligand-protein energy repulsive
term and ligand conformational energy term by (1) providing a first data base of
MHC Class II molecule models; (2) providing a second data base of allowed
peptide backbones for said MHC Class II molecule models; (3) selecting a model
from said first data base; (4) selecting an allowed peptide backbone from said
second data base; (5) identifying amino acid residue side chains present in each
sampled segment; (6) determining the binding affinity value for all side chains
present in each sampled segment; and repeating steps (1) through (5) for each said
model and each said backbone.

26. A 13mer T-cell epitope peptide having a potential MHC class II binding activity
and created from immunogenically non-modified keratinocyte growth factor (KGF),
selected from the group as depicted in Table 1.

27. A peptide sequence consisting of at least 9 consecutive amino acid residues of a
13mer T-cell epitope peptide according to claim 26.

28. Use of a 13mer T-cell epitope peptide according to claim 26 for the manufacture of
keratinocyte growth factor (KGF) having substantially no or less immunogenicity
than any non-modified molecule with the same biological activity when used in
vivo.

29. Use of a peptide sequence according to claim 27 for the manufacture of
keratinocyte growth factor (KGF) having substantially no or less immunogenicity
than any non-modified molecule with the same biological activity when used in
vivo.
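
For illustration only, steps (a) to (d) of claim 24 may be sketched as follows. The
per-residue values in ASSUMED_SIDE_CHAIN_VALUES, the function names and the threshold are
hypothetical placeholders chosen for this sketch and are not values given in the present
description; the default window of 13 residues simply mirrors the 13mer peptides referred
to in claim 26.

```python
from typing import List, Tuple

# Hypothetical assigned values for hydrophobic side chains (illustrative only).
ASSUMED_SIDE_CHAIN_VALUES = {
    "F": 2.0, "W": 2.0, "Y": 2.0,            # aromatic residues
    "L": 1.0, "I": 1.0, "V": 1.0, "M": 1.0,  # aliphatic residues
}


def score_segments(sequence: str, window: int = 13) -> List[Tuple[int, str, float]]:
    """Steps (b) and (c): sample overlapping segments of uniform size and
    sum the assigned values of the hydrophobic residues in each segment."""
    if window < 3:
        raise ValueError("segments must contain at least three residues")
    scored = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        score = sum(ASSUMED_SIDE_CHAIN_VALUES.get(res, 0.0) for res in segment)
        scored.append((start, segment, score))
    return scored


def candidate_segments(
    scored: List[Tuple[int, str, float]], threshold: float
) -> List[Tuple[int, str, float]]:
    """Step (d): keep segments whose score reaches the chosen threshold."""
    return [entry for entry in scored if entry[2] >= threshold]
```

Segments returned by candidate_segments would be the starting point for the substitution,
addition or deletion contemplated in claim 22, after which the modified sequence can be
scored again to see whether the score has dropped below the threshold.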






INTERNATIONAL SEARCH REPORT

International Application No PCT/EP 02/01175

A. CLASSIFICATION OF SUBJECT MATTER

IPC 7 C07K14/50 A61K38/18 C07K7/08

According to International Patent Classification (IPC) or to both national classification and IPC

B. FIELDS SEARCHED

Minimum documentation searched (classification system followed by classification symbols)

IPC 7 C12N C07K A61K

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practical, search terms used)

BIOSIS, EPO-Internal, WPI Data, EMBL, CHEM ABS Data, MEDLINE



C. DOCUMENTS CONSIDERED TO BE RELEVANT

Citation: US 5 858 977 A (AMGEN INC; AUKERMAN SHARON LEA ET AL)
12 January 1999 (1999-01-12)
"Method of treating diabetes mellitus using keratinocyte growth factor"
column 4, line 65 - column 5, line 55
Relevant to claim No.: 1-20

Citation: WO 96 11952 A (AMGEN INC; HSU ERIC W (US); KENNEY WILLIAM C (US); TRESSEL TIM (US))
25 April 1996 (1996-04-25)
"Method for purifying keratinocyte growth factors"
* SEQ ID NO: 33 and 34 *
figure 8; example 1
page 6, line 33 - page 8, line 2
Relevant to claim No.: 1-19

Further documents are listed in the continuation of box C.

Patent family members are listed in annex.



Special categories of cited documents:

"A" document defining the general state of the art which is not considered to be of
particular relevance

"E" earlier document but published on or after the international filing date

"L" document which may throw doubts on priority claim(s) or which is cited to establish
the publication date of another citation or other special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or other means

"P" document published prior to the international filing date but later than the
priority date claimed

"T" later document published after the international filing date or priority date and
not in conflict with the application but cited to understand the principle or theory
underlying the invention

"X" document of particular relevance; the claimed invention cannot be considered novel
or cannot be considered to involve an inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention cannot be considered to
involve an inventive step when the document is combined with one or more other such
documents, such combination being obvious to a person skilled in the art

"&" document member of the same patent family



Date of the actual completion of the international search: 26 June 2002

Date of mailing of the international search report: 10/07/2002

Name and mailing address of the ISA:
European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax: (+31-70) 340-3016

Authorized officer: Niebuhr-Ebel, K

Form PCT/ISA/210 (second sheet) (July 1992)





C. (Continuation) DOCUMENTS CONSIDERED TO BE RELEVANT

Category: X
Citation: WO 96 11949 A (AMGEN INC; CHEN BAO LU (US); HSU ERIC W (US); KENNEY WILLIAM C (US))
25 April 1996 (1996-04-25)
"Analogs of keratinocyte growth factor"
page 6, line 18 - page 8, line 8
Relevant to claim No.: 1-20

Category: X
Citation: BARE LANCE A ET AL: "Effect of cysteine substitutions on the mitogenic
activity and stability of recombinant human keratinocyte growth factor."
BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS,
vol. 205, no. 1, 30 November 1994 (1994-11-30), pages 872-879, XP001084037
ISSN: 0006-291X
page 875, line 1 - page 876, paragraph 1, figures 1,2
Relevant to claim No.: 1-19

Category: X (claims 21-25); Y (claims 26-29)
Citation: WO 00 34317 A (ADAIR FIONA SUZANNE; CARR FRANCIS JOSEPH (GB); HAMILTON ANITA ANNE)
15 June 2000 (2000-06-15)
"Modifying protein immunogenicity"
page 3, line 23 - page 13, line 13
Relevant to claim No.: 21-25; 26-29

Category: X (claims 21-25); Y (claims 26-29)
Citation: WO 98 52976 A (BIOVATION LTD; CARR FRANCIS J (GB))
26 November 1998 (1998-11-26)
"Method for the production of non-immunogenic proteins"
page 3, line 1 - line 28
Relevant to claim No.: 21-25; 26-29

Category: A (claims 1-29); Y (claims 26-29)
Citation: EP 0 619 370 A (AMGEN INC)
12 October 1994 (1994-10-12)
"Keratinocyte growth factor (KGF) for use in methods of therapeutic treatment for
the human or animal body"
the whole document
Relevant to claim No.: 1-29; 26-29

Form PCT/ISA/210 (continuation of second sheet) (July 1992)






Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet)

This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

1. [ ] Claims Nos.:
because they relate to subject matter not required to be searched by this Authority, namely:

2. [X] Claims Nos.: 1-5, 6-29 (partially)
because they relate to parts of the International Application that do not comply with the prescribed requirements to such
an extent that no meaningful International Search can be carried out, specifically:

see FURTHER INFORMATION sheet PCT/ISA/210

3. [ ] Claims Nos.:
because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows:

1. [ ] As all required additional search fees were timely paid by the applicant, this International Search Report covers all
searchable claims.

2. [ ] As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment
of any additional fee.

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this International Search Report
covers only those claims for which fees were paid, specifically claims Nos.:

4. [ ] No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

Remark on Protest
[ ] The additional search fees were accompanied by the applicant's protest.
[ ] No protest accompanied the payment of additional search fees.

Form PCT/ISA/210 (continuation of first sheet (1)) (July 1998)






FURTHER INFORMATION CONTINUED FROM PCT/ISA/210

Continuation of Box I.2

Claims Nos.: 1-5, 6-29 (partially)



Claims 1-5 refer to a "modified molecule having the biological activity 
of keratinocyte growth factor (KGF)...". These molecules are of 
unspecified constitution or structure, i.e. no true technical 
characterization of these compounds is given. No search can be carried 
out on the basis of the vague definition "substantially non-immunogenic 
or less immunogenic", which is, in fact, a mere recitation of the result 
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international preliminary examination (Rule 66.1(e) PCT). The applicant 
is advised that the EPO policy when acting as an International 
Preliminary Examining Authority is normally not to carry out a 
preliminary examination on matter which has not been searched. This is 
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